Abstract


This report summarizes research carried out at the Thomas J.
Watson Research Center in three areas of computational
linguistics. These are 1) the design and development of a
transformational grammar for a subset of grammatical sentences
in English, 2) the implementation of this grammar in terms of a
sentence synthesizing program written in LISP 1.5, and 3) the
use of sentence synthesizing programs for transformational
grammars generally.

The transformational grammar described provides a semantically
interpretable deep structure and a phonologically interpretable
surface structure for a set of English sentences. Surface
structures are derived from deep structures by transformational
rules. The sentence types characterized by the grammar include
noun phrase complementation, verb phrase complementation,
relative clauses, two types of question sentences, indirect
object and prepositional phrase constructions, passives,
aspectual constructions, and certain types of negation
phenomena.

The sentence synthesizing program provides a facility for
generating deep structures and surface structures. Further, it
makes use of prototype fast-fail procedures which, in many
cases, obviate either completely or partially the so-called
proper analysis test for the applicability of transformations.

Observations on the use of the sentence synthesizing program
include 1) an analysis of the results obtained in using the
program as a device for evaluating the descriptive adequacy of
the grammar and 2) a discussion of the methodological
limitations imposed upon the use of the program by factors
inherent in the linguistic subject matter.

1.0 INTRODUCTION

This paper is a report on basic research activities directed
toward the specification, development, and computer utilization
of grammatical descriptions consistent with the advances in
linguistic theory usually subsumed under the rubric of
transformational linguistics. Our primary concerns have been
1) the specification of a descriptively adequate
transformational grammar for English, and 2) the development of
prototype computational procedures for testing the descriptive
adequacy of transformational grammars of advanced design. In
addition to a discussion of these topics, this report gives
brief attention to our experiences in using the prototype
sentence synthesizing program (SSP) and to program design
considerations arising from our experiences.

2.0 THE IBM ENGLISH GRAMMAR I¹

2.1 General Properties

In its linguistic essentials, the IBM English Grammar I
(Grammar I) conforms to the most important of the recent
theoretical discoveries in transformational linguistics.² In
particular, sentences in English characterized by Grammar I are
assigned two levels of representation: a deep structure and a
surface structure. Deep structures, generated by context-free
rewriting rules, determine the semantic interpretation of
sentences. Deep structures are mapped into surface structures
by transformational rules. Thus, all surface structures derived
from a common deep structure through the application of
transformational rules are necessarily synonymous. Deep
structures, surface structures, and intermediate
transformationally derived structures are formally represented
by labelled bracketings known as P-markers.³

Deep structures are composed of categorial symbols such as S,
NP, VP (sentence, noun phrase, verb phrase) and lexical items
consisting of a phonological distinctive feature matrix
(abbreviated in Grammar I) and a syntactic feature vector
specifying various inherent, distributional, and
rule-determining properties of particular lexical entries. (A
sample deep structure is provided in Appendix I.) Categorial
symbols are introduced by context-free rewriting rules (Cf.
Appendix II for the rewriting rules of Grammar I) and lexical
items are introduced into P-markers subject to various
semi-transformational distributional constraints. (Cf. Appendix
III for a sample lexicon.)

The domain of a particular transformational rule is provided in
terms of conditions on P-markers. Any P-marker or set of
P-markers meeting the conditions imposed by a particular rule
falls under the domain of that rule. By way of clarification,
consider the following hypothetical transformational rule.

(1)  B + C    D    X
       1      2    3   ====>
       1      0    3

The numbered sequence of elements comprising the first two lines
of the above rule (referred to as a structure index) defines
the set of P-markers which may undergo the transformational
alteration stated in the third line of the rule. The structure
index can be interpreted as asserting that any terminal string
(last line of a P-marker) falls under the domain of the
transformation (1) just in case it can be completely segmented
into three consecutive substrings such that the first is a
(member of the constituent or category sequence) B + C, the
second is a D, and the third is anything at all. The diagram
(2) contains a P-marker which falls under the domain of the
transformation (1).

(2)  [P-marker diagram not reproduced]


The terminal string of (2), i.e., E - F - C - G - H - M - N,
can be segmented in such a way that the conditions stipulated
by the structure index are met. Transformational rules stated
in this fashion have the power of variable reference since each
structure index characterizes a variety of P-markers. For
example, the transformation (1) would be defined on the
P-marker (2) regardless of the constituency dominated by B. If
the phrase structure component from which this P-marker was
constructed contains the rule B ----> E + L + S, then an
infinite number of P-markers are provided (since S, the
sentence node, is recursive), all of which will fall under the
domain of the transformation (1). Conditions imposed by
transformational rules on P-markers include conditions on
syntactic feature composition as well as on phrase structure.
For instance, for the transformation (3), a P-marker falls under
its domain just in case it contains some constituent L which
dominates a syntactic feature vector containing the feature
<+human>.

(3)  X    [<+human>]    Y
                   L
     1        2         3   ====>
     1        0         3
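The segmentation test described for rule (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of P-markers, the names `spans` and `falls_under`, and the sample tree are all hypothetical (diagram (2) itself is not available), and feature conditions such as the one in rule (3) are not handled.

```python
def spans(tree, start=0, table=None):
    """Collect (label, start, end) for every node; return (end, table)."""
    if table is None:
        table = []
    label, children = tree[0], tree[1:]
    end = start
    if not children:
        end = start + 1          # a leaf covers exactly one terminal symbol
    for child in children:
        end, _ = spans(child, end, table)
    table.append((label, start, end))
    return end, table


def falls_under(tree, index):
    """True if the terminal string can be segmented as the index requires."""
    length, table = spans(tree)

    def match(pos, rest):
        if not rest:
            return pos == length          # the segmentation must be complete
        head, tail = rest[0], rest[1:]
        if head == "X":                   # variable: any stretch, even empty
            return any(match(p, tail) for p in range(pos, length + 1))
        return any(match(end, tail)
                   for label, begin, end in table
                   if label == head and begin == pos)

    return match(0, list(index))


# Hypothetical P-marker, standing in for the lost diagram (2).
marker = ("S", ("B", ("E",), ("F",)), ("C", ("G",)), ("D", ("H",)),
          ("M",), ("N",))
print(falls_under(marker, ["B", "C", "D", "X"]))
```

Because the test looks only at which node covers which stretch of the terminal string, the result does not change when B is expanded differently, which is the "variable reference" property noted above.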

Transformational rules often involve two special types of
restrictions. The first of these is dominance, where some
subtree in a P-marker must either have a certain analysis or
must dominate some particular subtree. The second type of
restriction is identity, where a subtree must be identical or
not identical to some other subtree.

A transformational rule specifies a finite sequence of formal
operations called elementary transformations. The elementary
transformations employed in Grammar I are as follows:

1. Substitution, where one subtree is substituted for another
subtree.

2. Deletion, where a subtree is deleted.

3. Sister Adjunction, where a subtree is introduced under the
immediate domination of some constituent which immediately
dominates at least one other constituent.

4. Daughter Adjunction,⁴ where some subtree is introduced under
the immediate domination of some constituent which does not
dominate any other constituent.
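The four elementary transformations can be illustrated on a simple list encoding of subtrees. The sketch below is hypothetical and is not the LISP implementation described later in this report: a node is assumed to be a Python list `[label, child, child, ...]`, and the function names and the `right` parameter are introduced only for illustration.

```python
def substitute(parent, i, new):
    """Substitution: replace the i-th daughter of parent by new."""
    parent[i + 1] = new


def delete(parent, i):
    """Deletion: remove the i-th daughter of parent."""
    del parent[i + 1]


def sister_adjoin(parent, i, new, right=True):
    """Sister adjunction: add new next to the i-th daughter of parent."""
    if len(parent) < 2:
        raise ValueError("parent must already dominate a constituent")
    parent.insert(i + 2 if right else i + 1, new)


def daughter_adjoin(node, new):
    """Daughter adjunction: add new under a node with no daughters."""
    if len(node) > 1:
        raise ValueError("node already dominates a constituent")
    node.append(new)


np = ["NP", ["DET"], ["N"]]
sister_adjoin(np, 1, ["S"])      # NP now dominates DET, N, S
delete(np, 0)                    # NP now dominates N, S
print(np)
```

Daughters are counted from zero, so `np[i + 1]` is the i-th daughter because position 0 of the list holds the label.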

Transformational rules are ordered and are either cyclic⁵ or
post-cyclic. Cyclic rules apply to a lowest S in a deep
structure, where a lowest S is defined as any sequence of
terminal symbols 1) bounded left and right by sentence
boundaries (#), 2) analyzable as an S, and 3) not containing any
sentence boundaries except for those mentioned above. The final
cyclic transformation deletes the sentence boundaries and, if
the lowest S condition is still met, that is, if the sentence to
which the cyclic rules were applying is an embedded sentence,
the set of cyclic transformations reapplies. Otherwise, the set
of post-cyclic transformations applies to the highest S, namely,
one not dominated by S. The full set of cyclic and post-cyclic
transformations of Grammar I is given in Appendix IV.
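The order of application just described can be approximated by a bottom-up traversal. The following Python sketch is a simplification under stated assumptions: it omits the sentence boundaries (#) and simply visits every embedded S before the S that dominates it. The names `derive` and `first_leaf`, the list encoding of trees, and the sample tree are hypothetical.

```python
def first_leaf(node):
    """Return the leftmost terminal symbol dominated by node."""
    while len(node) > 1:
        node = node[1]
    return node[0]


def derive(tree, cyclic, post_cyclic):
    """Apply cyclic rules to each S, lowest first, then post-cyclic rules."""
    def cycle(node):
        for child in node[1:]:
            cycle(child)                 # embedded sentences come first
        if node[0] == "S":
            for rule in cyclic:          # rules are ordered
                rule(node)
    cycle(tree)
    for rule in post_cyclic:             # highest S only
        rule(tree)
    return tree


log = []
tree = ["S", ["NP", ["John"]],
        ["VP", ["V", ["tempted"]], ["NP", ["Bill"]],
         ["S", ["NP", ["Bill"]], ["VP", ["V", ["play"]]]]]]
derive(tree,
       [lambda s: log.append(("cyclic", first_leaf(s)))],
       [lambda s: log.append(("post-cyclic", first_leaf(s)))])
print(log)
```

The logging rules stand in for real transformations; they only record which S each rule set was applied to, so the order of the cycle can be inspected.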

2.2 IBM English I

IBM English I (English I) is the subset of English sentences
generated by Grammar I. Although the physical dimensions of
Grammar I's rewriting rules are small (Cf. Appendix II), the
system is relatively powerful. This power stems from the
recursiveness of the initial symbol, S, under the domination of
the verb phrase (VP), the noun phrase (NP), and the determiner
(DET). The expansions of VP are of particular interest.

(4)  VP ----> V S

(5)  VP ----> V NP S

These expansions provide a deep structure characterization for
the syntactic phenomenon of intransitive and transitive verb
phrase complementation, respectively. Transformed, these
expansions give rise to the surface structures of sentences
like (6) and (7).

(6)  John condescended to play ball.

(7)  John tempted Bill to play ball.

Of equal importance are the two types of noun phrase
complementation which arise through the expansions of NP given
in (8) and (9).

(8)  NP ----> DET N S

(9)  NP ----> N S


These expansions yield a wide range of sentences including the
following:⁶

(10) a. the fact that John came late worries me
     b. it appears that John is honest
     c. John appears to be honest
     d. we stopped worrying
     e. I dislike being here
     f. she believes it to be true that life is good

The phenomenon of noun phrase complementation is extremely
productive since NP's appear in diverse positions in deep
structure; noun phrase complementation arises in all
distributions.

In addition to the complex sentence formations described above,
Grammar I characterizes such simple sentence phenomena as
question formation (both the so-called "yes-no" questions and
the "wh" questions), aspect, passive formation, certain types
of negation phenomena, and, at the transformational level,⁷
certain indirect object and prepositional phrase constructions.
A set of sentences contained in English I which illustrates the
major generative properties of Grammar I is provided in
Appendix V.

2.3 Deficiencies of Grammar I

Formal grammars invariably suffer from at least three types of
deficiency: incompleteness, incorrectness, and theoretical
slack. The general nature of these deficiencies is illustrated
below with respect to Grammar I.

2.3.1 Incompleteness

It is a truism that no grammar constructed in the foreseeable
future will generate all and only the sentences of an arbitrary
natural language. There are a number of reasons for this, but
acknowledgement simply of the immensity of natural language
suffices. Since there is no reason a priori to assume that the
completeness of a grammar is either necessary or sufficient for
the computational utilization of formal grammars of portions of
natural languages, it is important to recognize the bases of
grammatical incompleteness. In so doing, it becomes clear that
certain types of incompleteness are more susceptible to remedy
than are others.

First, a grammar may intentionally omit treatment of a
particular topic. For example, Grammar I does not deal with any
form of conjunction. The reason for this is that theoretical
support for the description of conjunction has been lacking.
Only recently have theoretical developments indicated that a
descriptive study of this phenomenon might become profitable.⁸
It is currently expected that Grammar III will contain the
results of such research.

Second, a grammar may be incomplete because of simple
oversight. To draw again on Grammar I, the formulation of the
passive transformation in this grammar does not allow for the
generation of passive sentences containing aspectual
morphemes.⁹ That is, Grammar I generates "John is teased by
silly girls", but not "John is being teased by silly girls."
Needless to say, such oversights are easily remedied.

2.3.2 Incorrectness

A grammar makes claims about the structure and derivation of
well-formed sentences. Often, such claims turn out to be
incorrect. Incorrectness may stem from a number of causes.

First, a grammar may be incorrect because it is incomplete. For
example, in generating the "non-exceptional" sentence "John
didn't want to behave himself", Grammar I provides a mechanism
which incorrectly predicts the grammaticality of "*John said to
behave himself." The problem here is that Grammar I lacks a
mechanism for dealing with exceptions. The grammar is not
basically wrong; it is simply incomplete, and this
incompleteness leads to incorrectness.

Second, and far more embarrassing, a grammar may be
linguistically incorrect, thereby providing an incorrect
analysis for generated sentences. For example, by allowing both
progressive and regressive deletion in relative clause
formation, Grammar I makes the incorrect claim that sentences
(11) and (12) are synonymous.

(11) I discovered the mountain which John is admiring

(12) I discovered which mountain John is admiring

2.3.3 Theoretical Slack

Theoretical slack means simply that the linguistic theory in
terms of which a grammar is constructed is insufficiently
specific to allow a choice among competing descriptions of the
same phenomenon. Grammar I, for example, is based upon a version
of linguistic theory which allows various properties of nouns,
verbs, and adjectives to be described in the deep structure
either in terms of constituent structure or in terms of
syntactic features subcategorizing nouns, verbs, and adjectives.
To take a specific case, number on nouns may be viewed either as
a constituent under the domination of NP, as specified in the
rewriting rule (13), or as the syntactic feature (singular)
positively or negatively specified (where (+singular) indicates
a singular noun and (-singular) indicates a plural noun).

(13) NP ----> N NU (S)

Grammar I adopts the feature analysis of number, but this
analysis is not theoretically determined. From the point of
view of the linguistic theory, this analysis is arbitrarily
selected.¹⁰

2.4 Directions in Ongoing Grammatical Research

An attempt is currently being made to remedy many of the
deficiencies in Grammar I. Grammar II will correct the
oversights of Grammar I and will, furthermore, treat such
syntactic phenomena as pronominalization, reflexivization,
genitivization, time and place adverbials, and
verb-preposition restrictions. In addition, corrections in the
analyses are being made, e.g., for progressive and regressive
deletion in relative clause formation, and a facility for the
treatment of exceptions is being added.¹¹

Finally, and by far the most important, Grammar II will be
consistent with a revised linguistic theory which calls for the
introduction of grammatical items such as articles, affixes,
prepositions, aspectual morphemes, complementizers, and the
like on the basis of the generated syntactic feature
composition of the formatives N (noun) and VB (verbs and
adjectives). The introduction of such items involves the
process of transformational segmentalization.¹² It is generally
significant that such a version of linguistic theory provides a
near approximation to universal deep structure. (It seems clear,
in any case, that potential computer applications involving the
analysis of English sentences will be greatly facilitated by
the removal of material entirely idiosyncratic to English from
the deep structure.) The phrase structure rewriting rules for
Grammar II are given in Appendix VI.

3.0 THE IMPLEMENTATION OF A SENTENCE SYNTHESIZER

3.1 General Description

A computer program for synthesizing sentences on the basis of
Grammar I was implemented as several functions in LISP 1.5. The
top level function, Deriv[], requires an expanded grammar¹³ and
a set of derivation control cards. The latter consists of 1) a
rewriting subrule¹⁴ specification for the generation of a
particular deep structure (syntactic feature vectors being also
introduced by such rules in an ad hoc manner) and 2) a
specification (possibly null) of the optional transformational
rules whose applicability is to be tested.

Deriv[], in generating deep structures, sequentially tests each
rule in the expanded grammar for applicability. Having
established the applicability of a rule, Deriv[] determines
whether a subrule of this main rule is specified on a
derivation control card. Upon success, the subrule so specified
is applied. For deep structure generation, rule application
involves the replacement of the applicable left-hand symbol
found in the current terminal string of symbols with the
right-hand constituency of the specified subrule. At the same
time, a P-marker is constructed which reflects the current
state of the deep structure. Upon successful subrule
application, a new derivation control card is read and the
applicability of the same rule is retested. When a rule is not
applicable, Deriv[] proceeds to test the applicability of the
following rule. Deriv[] provides a printed record of the subrule
applied, the current P-marker, and the current terminal string.

Deriv[] applies in cycles to each unexpanded S node. When no
more unexpanded S's remain, Deriv[] calls Derivtrans[], which
tests the applicability of all obligatory transformations and
the specified optional transformations, and converts the deep
structure into a surface structure by applying transformations
to current P-markers falling under their domain.
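The card-driven expansion performed by Deriv[] can be sketched as follows. This is a much reduced illustration, not the LISP 1.5 program: it builds only the terminal string (no P-marker), runs a single cycle, and assumes that a rule is skipped when the current card does not name one of its subrules. The toy grammar, the card format `(left-hand symbol, subrule number)`, and the name `deriv` are all hypothetical and are not taken from Appendix II.

```python
from collections import deque


def deriv(rules, cards, start="S"):
    """Expand the start symbol, choosing subrules as the cards direct."""
    string = [start]
    cards = deque(cards)
    for lhs, subrules in rules:                 # rules are tested in order
        # Retest the same rule after every successful application.
        while cards and lhs in string and cards[0][0] == lhs:
            _, choice = cards.popleft()         # read the next control card
            i = string.index(lhs)               # leftmost applicable symbol
            string[i:i + 1] = subrules[choice]  # replace it by the subrule
    return string


# Toy grammar: each rule has a left-hand symbol and a list of subrules.
rules = [("S", [["NP", "VP"]]),
         ("VP", [["V", "NP"], ["V"]]),
         ("NP", [["DET", "N"], ["N"]])]
cards = [("S", 0), ("VP", 0), ("NP", 0), ("NP", 1)]
print(deriv(rules, cards))
```

With these cards the first NP is expanded by its first subrule and the second NP by its second, which shows how a deck of cards fixes one particular deep structure.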

Derivtrans[] proceeds sequentially through the transformational
rules of the expanded grammar, testing each for applicability.
In the event that a proper analysis is obtained (i.e., where the
pattern specified in the structural index of a transformation
is found in the current P-marker), Derivtrans[] calls Dotran,
which applies the transformation to the P-marker under either
one of the following two conditions: first, the transformation
is obligatory; second, the transformation is an optional
transformation specified on the current derivation control
card. Derivtrans[] tests the set of cyclic transformations for
each deepest S. When no more deepest S's remain, this function
tests the applicability of the set of post-cyclic
transformations.

3.2 The Pattern-Matching Function: Syntax and Semantics

The heart of Derivtrans[] is the procedure for obtaining a
proper analysis in a current P-marker. The function P-a allows
the specification of the patterns in the structure indices of
transformations and identifies these patterns in P-markers.¹⁵
This function has the following form:

(14) P-a [<node:list>; <pattern>; <names>; <m:pairs>]


3.2.1 Syntax and the Modeling of Transformations

Subtrees in P-markers are represented as LISP s-expressions.
Consider, for example, the P-marker below.

(15)             S
               /   \
             NP     VP
            /  \   /  \
          DET   N V    NP
                      /  \
                    DET   N

This P-marker is represented as the following s-expression.

(16) (S(NP(DET)(N))(VP(V)(NP(DET)(N))))

The first argument to P-a is a list of sister nodes,
<node:list> := (<node>*). A node is a constituent and all the
constituents which it dominates in a P-marker, <node> :=
(<constituent> <node>*) and <constituent> := <atom>. For
instance, the node NP in the P-marker (15) is represented as
follows:

(17) (NP(DET)(N))

A representative list of sister nodes supplying a first
argument to P-a might be the following:

(18) ((NP(DET)(N))(VP(V)(NP(DET)(N))))
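For readers who want to experiment with this representation outside LISP, the s-expression (16) can be read into nested Python lists with a small parser. This is only a sketch; the function name `parse` and the choice of Python lists are assumptions made for illustration.

```python
import re


def parse(text):
    """Read one parenthesized s-expression into nested lists."""
    tokens = re.findall(r"[()]|[^\s()]+", text)

    def read(pos):
        if tokens[pos] != "(":
            raise ValueError("expected an opening parenthesis")
        node, pos = [], pos + 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                child, pos = read(pos)       # a dominated node
                node.append(child)
            else:
                node.append(tokens[pos])     # the constituent label
                pos += 1
        return node, pos + 1

    tree, end = read(0)
    if end != len(tokens):
        raise ValueError("unexpected text after the s-expression")
    return tree


marker = parse("(S(NP(DET)(N))(VP(V)(NP(DET)(N))))")
sisters = marker[1:]      # corresponds to the list of sister nodes in (18)
print(sisters)
```

Each node becomes a list whose first member is the constituent and whose remaining members are the nodes it dominates, mirroring the definition of <node> above.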

The second argument to P-a, <pattern>, is a list consisting of
an optional left-anchor¹⁶ followed by an indefinite number of
pattern elements to be matched, <pattern> :=
({$0 | 0} <pattern:element>*).

The third argument to P-a, <names>, is a list of names,
<names> := (<name>*), where <name> := <identifier> | <number>.
For each member of the pattern list there is a corresponding
name on the names list. Names are used to index matched nodes
for subsequent reference. The convention in this system is to
supply positive integers beginning with 1 as names.

The fourth argument to P-a, <m:pairs>, is a list of names
paired with the matched nodes indexed by these names,
<m:pairs> := ((<name> {<node> | <node:list> | <m:pairs>})*).


Pattern elements are of especial interest because they
characterize the devices available to the linguist in modeling
structure indices.

(19) <pattern:element> := {( ) | NIL} | <literal>
         | <alternation> | $ | <special:form> | <p:form>

The pattern element {( ) | NIL} is the empty pattern element
and is used to model optional non-empty patterns in a structure
index. The pattern element <literal> specifies the content of a
node to be matched. This element thus models the constituents
in a structure index. Sets of alternative pattern elements are
modeled as an alternation, <alternation> :=
(=OR <pattern:element>*). The pattern element $ models the
structural variables often found in transformational rules. The
pattern element <special:form>, where <special:form> :=
<conjunction> | <matching:function> | <dominance:constraint>,
is used by the linguist to model conjunctions within an
alternation, to introduce special matching conditions such as
identity and constraints on syntactic features, and to test
constraints on dominance of elements within structure indices.
Finally, <p:form>, which was not employed in Grammar I as a
pattern element, is of use in matching unusual tokens such as
+, $, etc.

3.2.2 The Pattern-Matching Algorithm

P-a attempts to find a proper analysis for a pattern
specification in various ways depending upon the character of
the pattern elements in the pattern specification. Beginning
with some candidate node, P-a tries to match the first pattern
element with that node. On success, P-a tries to match the next
pattern element with the next contiguous node¹⁷ in the
P-marker. If all pattern elements are matched successfully, the
function returns as its value a list of matched nodes and names
(mpairs) paired with the then current candidate node (which may
be null). If a pattern element fails to match a candidate node,
the left-hand daughter of the current candidate node becomes
the candidate node and the function is reapplied. If, under
these circumstances, no left-hand daughter exists, then the
function has not found a proper analysis and returns NIL.

Various pattern elements affect this procedure in various ways.
The null pattern element causes the current name to be paired
with the empty fragment (empty list) and added to the mpairs
list. The literal pattern element matches only nodes whose
content is identical to itself. The alternation pattern element
causes a match just in case one of the successively examined
patterns causes a match when substituted for the alternation as
the current pattern element.

The element $ matches a fragment (an indefinitely long list of
nodes) in the following manner. If the current name does not
appear on the mpairs list, the name is paired on the mpairs
list with the empty fragment.

1. If a proper analysis can then be found for the rest of the
current pattern (the pattern elements remaining to the right of
the $ element) beginning with the current candidate node, this
proper analysis is given as the resulting mpairs list (i.e.,
P-a [nodes; cdr[pattern]; cdr[names]; mpairs]).

2. Otherwise, the current candidate node is appended to the
current $ fragment to yield mpairs' and the next contiguous
node is taken as a current candidate'. If, on applying P-a
recursively, a proper analysis is found for the current pattern
(with the current $ as first element) beginning with the
current candidate', then that proper analysis is given as the
resulting mpairs list (i.e., P-a [nxtcontiguous[node:list];
pattern; names; mpairs']). In this way the $ is extended over
next contiguous nodes.

3. On the failure of both 1 and 2, the left-most daughter of
the current candidate node examined in 1 (not the current
candidate') is taken as the new current candidate node and step
1 is reinitiated. If no such left-hand daughter exists, then
the pattern match fails and the value of P-a is NIL.
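The matching procedure, including the three steps for $, can be sketched in Python. The sketch makes several assumptions that are not confirmed by the report: nodes are nested lists, the mpairs list is a dictionary from names to nodes (or to lists of nodes for $), the "next contiguous node" is taken to be the next node to the right once a node's daughters replace it in the candidate list, and only literals and $ are supported (no left-anchor, alternation, or special forms). The name `p_a` simply echoes P-a.

```python
def p_a(nodes, pattern, names, mpairs=None):
    """Return a dict of matched names, or None if no proper analysis."""
    mpairs = dict(mpairs or {})
    if not pattern:
        return mpairs                       # every element was matched
    head, name = pattern[0], names[0]
    if head == "$":
        mpairs.setdefault(name, [])         # pair with the empty fragment
        # Step 1: match the rest of the pattern at the current candidate.
        found = p_a(nodes, pattern[1:], names[1:], mpairs)
        if found is not None or not nodes:
            return found
        # Step 2: extend the $ fragment over the current candidate.
        longer = dict(mpairs)
        longer[name] = mpairs[name] + [nodes[0]]
        found = p_a(nodes[1:], pattern, names, longer)
        if found is not None:
            return found
    elif not nodes:
        return None
    elif nodes[0][0] == head:               # literal: identical content
        found = p_a(nodes[1:], pattern[1:], names[1:],
                    {**mpairs, name: nodes[0]})
        if found is not None:
            return found
    # Step 3 (and literal failure): descend to the leftmost daughter.
    daughters = nodes[0][1:]
    if not daughters:
        return None
    return p_a(daughters + nodes[1:], pattern, names, mpairs)


sisters = [["NP", ["DET"], ["N"]],
           ["VP", ["V"], ["NP", ["DET"], ["N"]]]]
print(p_a(sisters, ["$", "V", "NP"], [1, 2, 3]))
```

In this example the $ first tries the empty fragment, fails to find V under the subject NP, and is then extended over that NP, after which V and the object NP are found by descending into VP.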

3.2.3 Some Further Syntactic Considerations

Within P-a, pforms, where <p:form> := (QUOTE <s-expression>)
| <back:reference> | <subscript:reference> | (* <form>)
| (*K <arg>*), are used in a variety of ways. First, inasmuch
as certain possible node contents are not allowable as
literals, e.g., $, the pattern element pform provides a useful
facility. A pform is first evaluated and the resulting value is
taken as a literal. The value of the pform (QUOTE
<s-expression>) is simply the associated s-expression. If a
node previously matched is required, e.g., as is often the case
in special forms, either a back reference, where
<back:reference> := <name>, or a subscript reference,
<subscript:reference> := (S/name#), must be employed. The pform
(* <form>) causes the form to be LISP evaluated, e.g.,
(* DOLLAR) yields $. Finally, argument functions yield as values
the values of the LISP function, <function>, applied to the
evaluated arguments. This device makes available the full power
of the LISP language for the generation of arguments.

The pattern element <special:form> is either a conjunction, a
matching function, or a dominance constraint. The value of a
successful special form is an mpairs list rather than a node or
list of nodes. Thus, mpairs lists may appear on mpairs lists
and elements of such "contained" mpairs lists must be referred
to by subscript references.

Conjunction, where <conjunction> := (=AND <pattern:element>*),
treats a pattern as a pattern element, as in an alternation
where an alternative is a sequence of nodes.

Matching functions¹⁸ provide an escape mechanism from the
matching algorithm. This device allows the linguist to state a
very wide range of special conditions on P-markers not
otherwise specifiable. Consider the following matching function
form:

(FN <function> arg1 arg2 ... argn)

<function> is a LISP defined function of n+1 arguments where
the first argument, the current candidate node, is implicitly
supplied by the pattern element interpreter. Matching functions
obey the following conventions:

1. If the matching function fails, it returns the value NIL.

2. On success, the matching function returns a non-null value
(usually an mpairs list).

P-a does not descend into syntactic feature vectors¹⁹ (complex
symbols). Where conditions on transformations involve complex
symbols for Grammar I, they are implemented in terms of the
following special matching functions.

1.  (FN  FEATURE  arg  arg  ),  where  arg  evaluates  to  a 

1  4  J. 

terminal  node  (e.  g. ,  N,  V)  dominating  a  syntactic  feature 
vector  and  where  arg^  evaluates  to  a  list  of  features  which 
must  be  contained  in  the  feature  vector.  Consider,  for  ex¬ 
ample,  the  "WH  pronoun"  transformation. 


1  2  3  4====> 

12  0  4 

The  structure  index  for  this  transformation  is  modeled  as 
follow  s : 

(21)  ($  DEF  N  WH  (FN  FEATURE  3  (QUOTE(/Sg  /PRO)))) 

2. (FN ALPHA arg_1 arg_2) is identical to the above FN
except that the coefficient of the syntactic feature(s) of
arg_2 is ignored. This matching function is useful for
modeling structure indices employing the variable
coefficient, e.g., <αC>, in the Complementizer Placement
Transformation.
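
The two matching functions can be illustrated with a minimal sketch in Python rather than the report's LISP. The representation is an assumption made for illustration only: a terminal node is a pair of a category and a feature vector, the feature vector is a dictionary from feature name to coefficient, and the names `feature` and `alpha` are hypothetical stand-ins for FEATURE and ALPHA. The sketch follows the stated conventions: failure returns a null value and success returns a non-null one.

```python
from typing import Dict, List, Optional, Tuple

# Assumed representation: (category, {feature name: "+" or "-"})
Node = Tuple[str, Dict[str, str]]


def feature(node: Node, required: List[str]) -> Optional[List[Node]]:
    """FEATURE-style test: each signed feature, e.g. "-Sg", must be in the vector."""
    _, vector = node
    for signed in required:
        sign, name = signed[0], signed[1:]
        if vector.get(name) != sign:
            return None            # failure convention: NIL
    return [node]                  # success convention: a non-null value


def alpha(node: Node, required: List[str]) -> Optional[List[Node]]:
    """ALPHA-style test: same as above, but the coefficient is ignored."""
    _, vector = node
    for signed in required:
        name = signed.lstrip("+-")
        if name not in vector:
            return None
    return [node]
```

Under these assumptions, a noun marked minus for Sg passes the FEATURE test for "-Sg", fails it for "+Sg", and passes the ALPHA test for either coefficient.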

The  syntax  of  a  dominance  constraint  on  P-markers 
is  as  follows: 

(22) (dominance:constraint) := (*** (arg) (pattern))

The first argument specifies a previously matched node. The
second argument is the pattern to be matched in the subtree
dominated by the node specified by the first argument. The
value of the pattern element is either an mpairs list, on
success, or NIL. The structure index

A B [C D]E
1 2 3 4

is modeled as follows:

(23) (A B E (*** 3 (QUOTE ($0 C D))))

3.3  The  Pattern  Transformation 

P-markers are transformed by the LISP function
Dotran[u; tr; ct] where u is an mpairs list produced by the
successful application of P-a, ct is the segment of the
P-marker falling under the domain of the transformation,
identified by its highest node, and tr is a list specifying
the transformation.

(tr) := ((replacement)*)

(replacement) := ((arg) (arg)*)

The first argument of a replacement is either a node produced
as the value of a back reference or subscript reference, or
else is an argument function. In the first case, the remaining
arguments specify the nodes which are to replace the node
referenced by the first argument. In the second case, an
argument function may be employed to modify ct in some
other manner than by sister adjunction, substitution, or
deletion.

Consider how the relative placement transformation
of Grammar I is modeled.

(24) # X ART S N Y #

1 2 3 4 5 6 7 ====>

1 2 3 0 5+4 6 7

(25) ((# $ ART S N) (4 0) (5 5 4))

Observe  that  deletion  involves  replacement  of  a  node  by  0. 
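
The replacement list in (25) can be traced with a small sketch, written in Python rather than LISP. It is a deliberately flattened approximation and not the report's Dotran: the matched P-marker is assumed to be a sequence of numbered segments (lists of labels) instead of a tree with an mpairs list, and the function name `dotran` is borrowed only for readability. Each replacement is read as (target, item, item, ...), with 0 standing for deletion.

```python
from typing import Dict, List, Sequence

Segment = List[str]


def dotran(segments: Sequence[Segment], tr: Sequence[Sequence[int]]) -> List[str]:
    """Apply replacements of the form (target, item, ...) to numbered segments."""
    # Segments are numbered from 1, as in the structure index.
    result: Dict[int, Segment] = {i + 1: list(seg) for i, seg in enumerate(segments)}
    for target, *items in tr:
        replacement: Segment = []
        for item in items:
            if item != 0:                              # 0 marks deletion
                replacement.extend(segments[item - 1])  # read from the original match
        result[target] = replacement
    return [label for i in sorted(result) for label in result[i]]


# Segments matched by  # X ART S N Y #  (X is empty here; Y is a VP).
analysis = [["#"], [], ["ART"], ["S"], ["N"], ["VP"], ["#"]]
relative_placement = [(4, 0), (5, 5, 4)]
moved = dotran(analysis, relative_placement)
```

With these inputs the S is deleted from position 4 and re-adjoined to the right of N, which corresponds to the change 1 2 3 0 5+4 6 7 in (24).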

4.0 THE LINGUISTIC BASES OF HIGH SPEED SENTENCE
GENERATION

Perusal  of  the  transformational  rules  in  Grammar  I 
will  reveal  that  structural  variables  appear  quite  often  as  the 
first  and  last  pattern  elements  of  a  structure  index.  These 
variables  capture  the  linguistically  significant  generalization 
that  a  particular  transformational  process  is  completely  in¬ 
dependent  of  the  constituent  material  falling  under  the  domain 
of  such  variables  in  a  particular  P-marker.  This  generaliza¬ 
tion  is  not  reflected  in  P-a,  which  must  search  through  a 
P-marker  for  an  occurrence  of  a  pattern  element  even  though 
much  of  the  constituent  structure  which  is  traversed  in  this 
process  may  fall  under  the  domain  of  the  structural  variable. 
Needless  to  say,  such  irrelevant  searching  is  time-consuming. 

To a certain degree, the situation is salvageable inasmuch
as the imposition of an arbitrary depth-of-embedding
constraint on the rewriting rules renders it possible in
principle to state transformations in terms of literals. In
other words, it becomes possible to specify all of the
constituent structure material ordinarily subsumed under the
variable. At the very least, such a procedure is
linguistically distasteful. Worse, however, are the
consequences if the grammar under study is even minimally
complex, e.g., handles complementation. Under such
circumstances, a vast number of environments would have to
be specified for the most trivial of rules. Consider, for
example, the number of "left" environments Grammar I would
be forced to provide, even if the depth of embedding were
arbitrarily limited, for the trivial post-cyclic rule which
assigns an affix to plural nouns.

There is little question that the routine for obtaining a
proper analysis which we have devised is not optimal and that
greater processing efficiency can be expected in future
versions of SSP. Our efforts to solve the problem of the
structural variable have been based on the assumption that
the fastest routine for obtaining a proper analysis is no
routine at all. Less glibly, we have addressed two questions.
If a particular P-marker does not fall under the domain of a
transformation, is it possible 1) to prevent entry into the
proper analysis routine in the first place, and 2) if not, to
achieve a rapid termination of this routine? The answer to
both of these questions is yes. We refer to techniques
accomplishing these tasks as fast-fail procedures.

4.1 How Not to Proper Analyze

4.1.1 Node Listing

Node listing is a simple procedure for obviating a
proper analysis which takes advantage of the fact that a
necessary (though not sufficient) condition for obtaining a
proper analysis is that the P-marker in question must
contain every literal contained in the structure index.
Before the application of P-a, a test is made on the
P-marker to determine whether it contains at least one
instance of each of the literals supplied with the
transformation. This procedure turned out to be of minimal
value since most P-markers contain most constituents.
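
A minimal sketch of the node listing test follows, in Python rather than LISP. The tree encoding (a pair of a label and a list of subtrees) and the function names are assumptions made for illustration and are not taken from SSP.

```python
from typing import List, Set


def labels(tree) -> Set[str]:
    """Collect every node label in a tree encoded as (label, [subtrees])."""
    label, children = tree
    found = {label}
    for child in children:
        found |= labels(child)
    return found


def passes_node_listing(tree, literals: List[str]) -> bool:
    """Necessary (not sufficient) test: every literal occurs somewhere in the tree."""
    return set(literals) <= labels(tree)


p_marker = ("S", [("NP", [("ART", []), ("N", [])]),
                  ("AUX", [("T", [])]),
                  ("VP", [("V", [])])])
```

Only when `passes_node_listing` returns a true value would the full proper analysis routine need to be entered. As the text notes, the test rarely fails in practice because most P-markers contain most constituents.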

4.1.2 Sister Listing

Of  far  greater  value  is  a  similar  procedure  which 
tests  the  P-marker  for  the  sisterhood  of  two  or  more  con¬ 
stituents  mentioned  as  literals  in  a  structure  index.  For 
example,  the  Relative  Placement  Transformation  mentions 
the  sisters  ART  and  S.  Node  listing  would  be  of  little  value 
since  virtually  all  P-markers  contain  ART  and  all  P-markers 
do,  in  fact,  contain  S.  The  sister  listing  procedure  checks 
the  current  P-marker  to  determine  whether  ART  and  S  are 
contained  somewhere  as  sisters.  Such  a  test  will  be  suc¬ 
cessful  just  in  case  the  P-marker  contains  a  relative  clause 
construction.  Otherwise,  it  will  fail  and  Derivtrans[]  will 
immediately  proceed  to  the  next  transformation. 
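
The sister listing test can be sketched in the same way, again in Python and under the assumption that a tree is a pair of a label and a list of subtrees. The function name `has_sisters` and the example trees are hypothetical. The placement of ART and S under DET in the example is likewise an assumption made only so that the two nodes are sisters.

```python
from typing import Iterable


def has_sisters(tree, wanted: Iterable[str]) -> bool:
    """True if some node has all the wanted labels among its immediate daughters."""
    wanted_set = set(wanted)
    _, children = tree
    daughters = {child[0] for child in children}
    if wanted_set <= daughters:
        return True
    return any(has_sisters(child, wanted_set) for child in children)


# A P-marker with a relative clause: ART and S are sisters under DET.
with_relative = ("S", [("NP", [("DET", [("ART", []),
                                         ("S", [("NP", []), ("VP", [])])]),
                               ("N", [])]),
                       ("VP", [("V", [])])])

# A P-marker without one: ART and S both occur, but never as sisters.
without_relative = ("S", [("NP", [("DET", [("ART", [])]), ("N", [])]),
                          ("VP", [("V", [])])])
```

Both example trees contain ART and S, so node listing would let both through. Only the first passes the sisterhood test.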

4.2 Fast NIL for P-a

4.2.1 Terminalizing

Careful study of Grammar I's transformational rules
shows that the literals mentioned in the structure indices of
these rules are quite often terminal symbols, i.e., symbols
which uniquely appear in the terminal strings of P-markers.
This fact suggests the possibility of reducing considerably
the work which must be performed by P-a in the event that
entry into this routine is unavoidable. More specifically,
observe that when the structure index of a transformation
contains only terminal symbols, a search by P-a of the
non-terminal constituents of the current P-marker is
unnecessary. Such redundant searching can be obviated by
requiring P-a to apply to a temporary P-marker, P-marker*,
which contains the highest node, S, and the terminal nodes
of the original P-marker, but none of its intermediate
nodes. This "tree pruning" procedure, which turns out to be
extremely valuable and will be even more so for Grammar II,
is called terminalizing.
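
The construction of P-marker* can be sketched as follows, in Python and assuming a tree is a pair of a label and a list of subtrees. The function names are hypothetical. Nodes without daughters are treated as the terminal nodes.

```python
def leaves(tree):
    """Return the terminal nodes of a (label, [subtrees]) tree, left to right."""
    label, children = tree
    if not children:
        return [(label, [])]
    found = []
    for child in children:
        found.extend(leaves(child))
    return found


def terminalize(tree):
    """Build P-marker*: the highest node dominating only the terminal nodes."""
    label, children = tree
    if not children:
        return (label, [])
    return (label, leaves(tree))


p_marker = ("S", [("NP", [("ART", []), ("N", [])]),
                  ("AUX", [("T", [])]),
                  ("VP", [("V", []), ("NP", [("ART", []), ("N", [])])])])
p_marker_star = terminalize(p_marker)
```

A matcher applied to `p_marker_star` visits one level of daughters instead of the full tree, which is the saving the text describes for structure indices that mention only terminal symbols.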

4.2.2 Node Weighting

When entry into P-a is unavoidable and when the structure
index of the current transformation contains non-terminal
as well as terminal symbols, terminalizing is impossible.
The only fast-fail procedure currently operating in SSP for
reducing the amount of work done by P-a in this instance is
node weighting. This procedure takes advantage of the fact
that the number of eligible proper analysis nodes in the
current P-marker must always be equal to or greater than the
number of literal pattern elements remaining to be matched.
Under this procedure each pattern element is assigned a
weight reflecting a minimum node requirement for the current
P-marker. Similarly, nodes in the current P-marker are
assigned weights in accordance with their position in the
P-marker for possible proper analysis. At such time as the
weight of the pattern element exceeds the weight of the node
being examined, P-a terminates.
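
The idea behind node weighting can be sketched on a simplified matcher, in Python. The weighting scheme here is an assumption chosen to fit the description, not the scheme used in SSP: the P-marker is flattened to a list of node labels, "$" stands for a structural variable, the weight of a pattern element is the number of literals still to be matched, and the weight of a node is the number of nodes remaining from its position onward.

```python
from typing import List, Optional

VARIABLE = "$"


def match_with_weights(pattern: List[str], nodes: List[str]) -> Optional[List[int]]:
    """Return the node positions matched by the literals, or None on failure."""
    # Pattern weight: number of literals from element i to the end.
    p_weight = [0] * (len(pattern) + 1)
    for i in range(len(pattern) - 1, -1, -1):
        p_weight[i] = p_weight[i + 1] + (1 if pattern[i] != VARIABLE else 0)

    def search(i: int, j: int) -> Optional[List[int]]:
        # Node weight: nodes still available from position j onward.
        if p_weight[i] > len(nodes) - j:
            return None                          # fast fail
        if i == len(pattern):
            return [] if j == len(nodes) else None
        if pattern[i] == VARIABLE:
            for k in range(j, len(nodes) + 1):   # the variable covers nodes[j:k]
                rest = search(i + 1, k)
                if rest is not None:
                    return rest
            return None
        if j < len(nodes) and nodes[j] == pattern[i]:
            rest = search(i + 1, j + 1)
            return None if rest is None else [j] + rest
        return None

    return search(0, 0)
```

In this sketch the comparison at the top of `search` plays the role of the weight test: as soon as more literals remain than nodes, the search below that point is abandoned without examining any labels.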

Grammar II, which has considerably different properties
than Grammar I, presents several possibilities for effective
fast-fail procedures. We plan to report on these at a later
date.

5.0 THE USE OF THE SENTENCE SYNTHESIZING
PROGRAM

5.1  Research  Goals  and  Their  Consequences 

Our conclusions concerning the use of SSP in the
development of English Grammar I only have meaning in
terms of our research goals. Our central research goal in
computational linguistics is to install in an electronic
computer a knowledge of natural language (English in the
present case) which reflects the English speaker's ability
to relate the form of a sentence to its meaning. Inasmuch as
the immensity of English renders impossible a full
reconstruction of this knowledge in the form of a
transformational grammar, our less ambitious goal is to
construct a transformational grammar for a subset of English
sentences which is both useful and learnable. The usefulness
of such a subset is completely a function of its expressive
power. Clearly, the existence of such a subset is
meaningless if this subset (which, in all likelihood, will
be infinite) is not learnable.

This goal establishes three requirements for computational
linguistic research. The first is to pursue topics in
linguistic theory since it is generally true that the more
advanced the linguistic theory, the more general are the
grammars whose form is a consequence of this theory. The
second is 1) to develop precisely specified grammars which
are descriptively correct with respect to the assignment of
deep structures to surface structures and 2) to study
computational procedures for implementing these grammars.
The third is to study the useability and learnability of the
subsets of natural language generated by these grammars. Our
views on the use of SSP are couched in terms of these
considerations.

5.2 General Conclusions

5.2.1 Sentence Synthesis and Linguistic Theory

The  relation  obtaining  between  linguistic  theory  and  a 
sentence  synthesizing  program  is  one  of  specification,  in  that 
the  linguistic  theory  specifies  the  form  of  the  grammar  which 
is  implemented  by  the  program.  This  fact  is  perhaps  discon¬ 
certing  since,  inasmuch  as  constancy  over  time  has  not  yet 
become  a  property  of  transformational  linguistic  theory,  it 
suggests  the  necessity  of  constant  revision  for  the  SSP.  In  a 
weaker  moment,  one  may  fancy  an  SSP  which  allows  a  linguist 
to  make  arbitrary  changes  in  his  theoretical  formulation,  but 
recognition of the utter nonexistence of discovery procedures
for linguistic theories, i.e., the complete lack of any basis
for projecting future developments, persuades us that such a
device is an impossibility.

It  is  always  possible  to  take  the  linguistically  and,  in 
the  long  run,  computationally  unfortunate  option  of  theoretical 
compromise,  thus  constructing  a  theoretically  antiquated 
grammar  for  the  sake  of  computation.  If,  however,  the  goal 
is  to  develop  a  grammar  that  is  theoretically  sound,  it  is  then 
our  conclusion  that  the  major  responsibility  for  developing 
and  maintaining  an  adequate  SSP  belongs  to  the  linguist  and 
not to the programmer. There is no computational procedure
for resolving difficulties inherent in linguistic methodology.

5.2.2 Sentence Synthesis and the Construction of
Grammars

Eschewing  linguistic  description  for  its  own  sake, 
there  are  two  reasons  for  constructing  transformational 
grammars.  The  first  of  these  is  to  confirm  or  disconfirm 
theoretical  hypotheses.  Since  generative  rigor  involving  the 
construction  of  anything  more  than  a  grammatical  sketch  has 
never  been  a  necessary  condition  for  the  wholehearted  ac¬ 
ceptance  or  rejection  of  such  hypotheses,  the  usefulness  of  a 
sentence  synthesizing  program  in  the  construction  of  such  a 
grammar  segment  is  extremely  doubtful.  On  the  other  hand, 
a  necessary  condition  for  computer  applications  based  upon 
transformational  grammars  is  the  generative  correctness  of 
large  descriptive  grammars.  In  this  respect,  a  sentence 
synthesizing  program  which  tests  the  rules  of  the  grammar 
is  a  must,  as  anyone  who  has  studied  the  mind-warping  pro¬ 
perties  of  complex  transformational  grammars  will  readily 
appreciate. 

It  is  our  observation  that  the  uses  of  a  sentence  syn¬ 
thesizing  program  are  most  reasonably  determined  not  so 
much  by  the  grammar  but  by  the  applications  requiring  the 
grammar.  Specifically,  the  facility  which  is  critical  to  com¬ 
puter  applications  involving  the  speaker's  knowledge  of  his 
language  is  the  transformational  reconstruction  of  the  rela¬ 
tion  obtaining  between  the  form  of  a  sentence  and  its  meaning, 
between  surface  structure  and  deep  structure.  This  relation 
is  precisely  specified  by  the  transformational  rules  of  the 
grammar. The implication here is that there is simply no
good reason to provide a computer implementation of those
facilities of a transformational grammar which do not have
direct bearing on the relation between meaning and form.
This assertion is reflected in our inability to find any use
for that extensive component of SSP which allows the
expansion of the rewriting rules of Grammar I and the
automatic generation of deep structures. In the testing of
transformational rules, a generative phrase structure
component is just so much baggage, and we normally introduce
deep structures for Derivtrans[] by hand. We are chagrined
to have spent so much time developing a subroutine which
turns out to have neither linguistic nor applicational
significance (save perhaps for an audio-visual aid in
Linguistics I). Much as a result
of  this  unfortunate  experience,  we  have  not  implemented  a 
blocking  facility  for  terminating  a  derivation  where  two  con¬ 
stituents  are  required  to  be  identical,  but,  in  a  particular 
P-marker,  are  not.  An  adequate  transformational  grammar 
must  no  doubt  provide  such  a  blocking  facility,  but  the  neces¬ 
sity  for  computer  implementation  of  this  facility  is  doubtful 
since,  on  the  theoretical  hand,  the  theoretical  claim  made  by 
any  such  blocking  device  could  be  trivially  evaluated  manually 
and  since,  on  the  applicational  hand,  deep  structures  exem¬ 
plifying  such  cases  of  non-identity  would  never  arise.  A 
blocking  mechanism  in  the  sentence  synthesizing  program 
itself  would  provide  nothing  more  than  a  laboratory  curiosity. 

SSP is of great value in answering the following question:
Given deep structure D, does the set of transformational
rules generate surface structure S? Most dramatic are those
cases where the transformations generate an incorrect
sentence S'. An illustrative example concerns the
transformation which assigns the affix "s" to plural nouns
in Grammar I.

X [<-Sg>]N Y

1 2 3 ====>

1 2+s 3

This rule asserts: If a noun carries the syntactic feature
<-singular> then add an "s" under the domination of this
noun regardless of all other aspects of the environment.
Since P-markers are reanalyzed after the application of a
transformation for the reapplicability of the same
transformation, the above transformation will and did
reapply indefinitely producing strings like
"noun s s s s s . . .". This consequence resulted from the
fact that the morpheme "s" was subsumed under the variable Y
on each reanalysis by P-a.
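
The runaway reapplication can be reproduced with a small sketch in Python. Everything in it is an illustrative assumption: the P-marker is flattened to a list of (word, features) pairs, the affix is placed after the noun rather than under its domination, and a pass limit (`max_passes`, not part of SSP as described) is added so that the loop terminates.

```python
from typing import Dict, List, Optional, Tuple

Item = Tuple[str, Dict[str, str]]


def apply_plural(items: List[Item]) -> Optional[List[Item]]:
    """X [<-Sg>]N Y ==> 1 2+s 3, applied to the first plural noun found."""
    for i, (_, feats) in enumerate(items):
        if feats.get("cat") == "N" and feats.get("Sg") == "-":
            # Everything to the right, including any earlier "s", falls under Y.
            return items[: i + 1] + [("s", {})] + items[i + 1:]
    return None


def derive(items: List[Item], max_passes: int) -> List[str]:
    """Reanalyze after each application, as described, up to max_passes times."""
    passes = 0
    while passes < max_passes:
        result = apply_plural(items)
        if result is None:
            break
        items = result
        passes += 1
    return [word for word, _ in items]


plural = [("boy", {"cat": "N", "Sg": "-"}), ("sleep", {"cat": "V"})]
singular = [("boy", {"cat": "N", "Sg": "+"}), ("sleep", {"cat": "V"})]
```

Because the rule places no condition on what follows the noun, each reanalysis finds the same noun again. Without the pass limit the loop in `derive` would not end for `plural`.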

Such  cases  as  the  above  are  fairly  uncommon.  More 
often  than  not,  the  output  of  SSP,  when  it  turns  up  an  error, 
is  completely  undramatic  since  when  something  is  wrong  the 
transformation  most  commonly  does  not  apply  at  all  and  the 
output  is  some  intermediate  structure.  This  circumstance 
was  especially  unnerving  while  the  SSP  was  being  debugged 
since  it  was  not  always  easy  to  determine  whether  a  rule 
failed  to  apply  because  it  would  not  or  because  it  could  not. 

Summarizing, SSP has turned up errors of the following
sort:

1. Transformational rule applies incorrectly.

2. Transformational rule fails to apply.

Reasons:

a. Traffic information incorrect, e.g., obligatory rule
marked optional.

b. Transformational rule stated incorrectly.

c. P-marker stated incorrectly.

Finally,  mention  should  be  made  of  the  fact  that  SSP 
operates  in  what  might  be  called  an  automatic  mode  in  trans¬ 
formational  generation  whereby  all  obligatory  transforma¬ 
tions  are  tested  against  a  P-marker  without  specification  on 
a  derivation  control  card.  Only  optional  rules  are  so  speci¬ 
fied.  It  is  often  the  case,  however,  that  linguistic  attention 
is  focussed  on  particular  transformations  and  the  effects  of 
others  are  beside  the  point.  For  such  cases,  which  arise 
very  often,  we  are  developing  a  manual  mode  of  operation 
for  the  sentence  synthesizing  program  which  will  implement 
Grammar  II.  In  this  manual  mode,  the  linguist  will  specify 
all  transformations  that  he  wishes  to  be  tested  against  a 
particular  P-marker.  The  manual  mode  provides  the  lin¬ 
guist  with  less  information  than  the  automatic  mode,  but 
such  omitted  information  is  often  superfluous  and  can  be 
profitably sacrificed for speed.

making full use of full glory of the English language?

24. P. S. Rosenbaum, A. Baldwin, J. Samsky, "On the
Useability and Learnability of a Transformationally
Generated Subset of English," (forthcoming).


This requirement applies only to Grammar I. The theory
ply to all legitimate proper analyses "simultaneously"
as it were. After the application of a transformation,
the transformed P-marker is not reanalyzed for the

APPENDIX I


Sample  Deep  Structure 


Actual  string:  Bill  would  prefer  for  John  not  to  have  dreamed 


APPENDIX II

Phrase Structure Rewriting Rules
for Grammar I

S    ->  # (PRE) NP AUX VP #
PRE  ->  (NEG) (Q)
AUX  ->  T (M)
T    ->
VP   ->
PP   ->  PREP NP
MAN  ->  PREP P
NP   ->  (DET) N (S)


!WH){iDndef} 


ART  - > 


APPENDIX III
Sample Lexicon

Syntactic Category

<+N>    boy
<+ADJ>  honest
<+V>    slay
<+M>    must

Strict Subcategorization

Verbs <+V>

<+ _ NP S>  tempt
<+ _ NP>    disappoint
<+ _ PP>    approve
<+ _ S>     condescend
<+ _ >      elapse

Nouns <+N>

<+DET _ S>  fact
<+DET _ >   teapot
<+ _ S>     it
<+ _ >      John

Inherent Subcategorization

Nouns <+N>

<+human>     boy
<+animate>   mongoose
<+abstract>  blame
<-animate>   table
<-abstract>

Selectional Subcategorization

Verbs <+V>

<+ _ <+ _ S>>                       suppose
<+ <+ _ S> S AUX _ >                bother
<+ _ (DET) <+human> PREP <+ _ S>>   remind

Adjectives <+ADJ>

<+ <+ _ S> S AUX >                  obvious

APPENDIX IV
Transformational Rules


I.  CYCLIC  RULES 

1.  CP  1  Complementizer  Placement  1 


rT  * 

# 

X 

[<«C»N  NP+< 

be 

have 

v  . 

► 

Y 

# 

1 

2 

3  4 

5 

A 

II 

II 

II 

II 

vO 

1 

2 

3  eC  +  4 

5 

6 

2. 

CP  2  Complementizer  Placement  2 

OB 

I 

‘T  4 

# 

X 

[<+C)]v  (NP) 

NP  +  1 

be 

have 

►  Y 

# 

1 

Lv  J 

1 

2 

3  4 

5 

6 

A 

II 

II 

II 

II 

1 

2 

3  4  +C  +  5 

6 

7 

3. CP 3 Complementizer Placement 3 OB

# X N [NP + Y]S Z #

1 2 3 4 5 6 7 ====>

1 2 3 that+4 5 6 7

4. IE Identity Erasure OB

# W (NP) X αC NP Y (NP) Z #

1 2 3 4 5 6 7 8 9 10 ====>

1 2 3 4 5 0 7 8 9 10

Condition: An NP_j is erased by an identical NP_i if and
only if there is an S_n such that

(i) NP_j is dominated by S_n

(ii) NP_i neither dominates nor is dominated by S_n

(iii) for all NP_k neither dominating nor dominated by S_n,
the distance between NP_j and NP_k is greater than the
distance between NP_j and NP_i where distance between two
nodes is defined in terms of the number of branches in the
path connecting them.


5. 

IOI 

Indirect  Object  Inversion 

OP 

# 

X 

V 

{np}  to+NP  y 

# 

1 

2 

3 

4  5  6 

7= 

===> 

1 

2 

3+5 

4  0  6 

I 

6. TO To Deletion OB

# X V to NP (PREP)+NP Y #

1 2 3 4 5 6 7 8 ====>

1 2 3 0 5 6 7 8

7. PASSIVE Passive OB

# (PRE) NP_1 AUX V (PREP) NP_2 X PREP P Y #

1 2 3 4 5 6 7 8 9 10 11 12 ====>

1 2 7 4 be+en+5 6 0 8 9 3 11 12

Condition: 3 4 7

8. EXTRA Extraposition OP

# X [<+_S>]N S Y #

1 2 3 4 5 6 ====>

1 2 3 0 5 4+6

9. PROREP Pronoun Replacement OB


#  X  [[<+_S>JN]Np  (AUX(be  en) V  +  (MAN))  GC 

NP  Y 

# 

12  3 

4 

5 

6  7 

8====> 

1  2  6 

4 

5 

0  7 

8 

10.  WHA 

WH-Attraction 

OB 

#  U  ART 

tnp  w  /PREP  +  rwH  xj  " 

1  W  ([WH  X]Np 

} 

Y]s  Z 

# 

12  3 

4  5  6 

7  8 

9====> 

1  2  3 

6+45  0 

7  8 

9 

11. RELPLACE Relative Placement OB

# X ART S N Y #

1 2 3 4 5 6 7 ====>

1 2 3 0 5+4 6 7

12.  AUXFIJLL  Auxiliary  Filler 

OB 

#  X  T 

{have}  Y  # 

12  3 

4  5  6=== 

:=> 

1  2  3  + 

4  0  5  6 

13.  AG 

Agreement 

OB 

#  (PRE) 

[(DET)[(«Sg)JNX]Np  j 

(PRESj 

[past/ 

Y 

’  # 

1  2 

3 

4 

5 

6== 

==> 

1  2 

3 

4<cSg 

5 

6 

Condition:  4  <  0 

14. EVER Ever OP

# X INDEF [<-human>]N WH N Y #

1 2 3 4 5 6 7 8 9 ====>

1 2 3 4 5 6+ever 7 8 9

15. REGDEL Regressive Deletion a. OP

#  X  WH  +  INDEF  +  if*  0  )  N  Y  # 

jINDEF  N  J  ^b.  every 


1  2 

3 

4  5 

6 

7  8====> 

1  2 

0 

0  5 

6 

7  8 

Condition:  4=6 

16. DEFI Definitization OB

# X N [(PREP)+WH INDEF N Y]S Z #

1 2 3 4 5 6 7 8 9 ====>

1 2 3 4 DEF 6 7 8 9

17. 

WHAG 

WH -Agreement 

OB 

# 

X 

WH 

VdEFJ  (ever*  [(ahurnan)]N  Y  # 

1 

2 

3 

4  5  6 

7  8= 

===> 

1 

2 

3 

4  <  ahuman  5  6 

7  8 

Condition:  4  <  0 

18. PROGDEL Progressive Deletion OB

# X N (PREP)+WH+DEF N Y #

1 2 3 4 5 6 7 ====>

1 2 3 4 0 6 7

Condition: 3=5

19. RELDEL Relative Deletion OP

# X N [WH Y]NP+be+PRES ADJ Z #

1 2 3 4 5 6 7 ====>

1 2 3 0 5 6 7

20. ADJPLACE Adjective Placement OB

# X N ADJ Y #

1 2 3 4 5 6 ====>

1 2 4+3 0 5 6

21. CDUP Complementizer Duplication OB

# X αC NP Y #

1 2 3 4 5 6 ====>

1 2 3 4+3 5 6

22. 

CNEG 

C  Negative  Placement 

OP 

# 

X 

“C  ‘{hive 

■)  +  T  NEG

APPENDIX V


Sentence  Types  Contained  in  English  I 

1.  the  boy  likes  the  girl 

2.  the  boys  like  the  girl 

3.  the  boy  liked  the  girl 

4.  the  boy  does  not  like  the  girl 

5.  the  boy  will  like  the  girl 

6.  the  boy  would  like  the  girl 

7.  the  boy  will  not  like  the  girl 

8.  the  boy  is  admiring  the  girl 

9.  the  boy  isn't  admiring  the  girl 

10.  the  boy  has  been  admiring  the  girl 

11.  the  boy  will  have  been  admiring  the  girl 

12.  does  the  boy  like  the  girl? 

13.  doesn't  the  boy  like  the  girl? 

14.  John  likes  the  girl  doesn't  he? 

15.  John  does  not  like  Mary  does  he? 

16.  is  John  admiring  Mary? 

17.  the  books  were  purchased  by  John 

18.  must  Mary  be  tormented  by  John? 

19.  John  gave  the  book  to  Mary 

20.  John  offered  Mary  the  book 

21.  the  book  was  offered  to  Mary  by  John 

22.  Mary  was  offered  the  book  by  John 

23.  who  sleeps? 

24.  what  boy  sleeps  ? 

25.  which  things  slip? 

26.  what  slips  ? 


27.  what  book  has  John  not  taken? 

28.  about  what  did  John  speak? 

29.  the  boy  who  must  leave  will  leave 

30.  the  book  of  which  John  speaks  is  awful 

31.  the  book  John  speaks  of  is  awful 

32.  John  touched  that  which  annoys  Bill 

33.  Bill  can  visualize  what  will  fall 

34.  whatever  falls  will  bounce 

35.  a  tall  boy  arrived 

36.  which  tall  boy  did  John  see? 

37.  John  would  like  for  Mary  to  leave 

38.  John  wants  Mary  to  leave 

39.  John  wants  Mary  to  be  loved  by  Bill 

40.  John  prefers  for  Bill  not  to  leave 

41.  Bill  would  prefer  for  John  not  to  have  dreamed 

42.  for  John  not  to  drown  would  be  preferred 

43.  it  is  required  for  John  to  stand 

44.  Bond  was  believed  to  be  dead  by  Goldfinger 

45.  John  loves  to  run 

46.  John  likes  to  be  taken 

47.  John  thinks  Bill  to  be  silly 

48.  John  decided  for  Bill  to  represent  Harry 

49.  John  decided  on  Bill  to  represent  Harry 

50.  John  appears  to  have  fallen 

51.  it  embarrasses  Bill  to  trip 

52.  John  may  resemble  Bill 

53.  John  dislikes  Bill's  annoying  Mary 

54.  John  dislikes  Bill  annoying  Mary

55.  John  dislikes  annoying  Mary

56.  John  decided  on  going

57.  John  thinks  that  Bill  will  go

58.  John  thinks  Bill  smokes

59.  that  Bill  smokes  was  mentioned  by  John

60.  Bill  mentioned  to  Mary  that  John  smokes

61.  it  was  mentioned  by  John  that  Bill  smokes

62.  Bill  tells  Mary  John  smokes

63.  Bill  reminded  Mary  to  go

64.  John  tempted  Mary  to  go

65.  John  condescended  to  go

66.  John  stops  wondering